A two-stage complex network using cycle-consistent generative adversarial networks for speech enhancement

Authors

Abstract

Cycle-consistent generative adversarial networks (CycleGAN) have shown promising performance for speech enhancement (SE), yet one intractable shortcoming of these CycleGAN-based SE systems is that noise components propagate throughout the cycle and cannot be completely eliminated. Additionally, conventional CycleGAN-based methods only estimate the spectral magnitude and leave the phase unaltered. Motivated by the multi-stage learning concept, in this paper we propose a novel two-stage denoising system that combines a magnitude-enhancing network with a subsequent complex refining network. Specifically, in the first stage, a model is responsible for estimating the magnitude, which is subsequently coupled with the original noisy phase to obtain a coarsely enhanced spectrum. After that, a second stage is applied to further suppress residual noise with a clean spectrum mapping network, a pure complex-valued network composed of 2D convolution/deconvolution and temporal-frequency attention blocks. Experimental results on two public datasets demonstrate that the proposed approach consistently surpasses previous one-stage CycleGANs and other state-of-the-art methods in terms of various evaluation metrics, especially background noise suppression.

Highlights:
• We propose a two-stage complex network for speech enhancement.
• We decompose complex spectrum estimation into two sub-tasks, i.e., magnitude and phase.
• The proposed network outperforms many state-of-the-art approaches.
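The first stage described above couples the estimated clean magnitude with the unaltered noisy phase. A minimal NumPy sketch of that recombination step (not the authors' implementation; the function name and the dummy magnitude estimate are illustrative assumptions):

```python
import numpy as np

def coarse_enhance(noisy_stft: np.ndarray, est_magnitude: np.ndarray) -> np.ndarray:
    """Couple an estimated clean magnitude with the noisy phase (stage 1).

    noisy_stft: complex STFT of the noisy utterance, shape (freq, frames).
    est_magnitude: magnitude predicted by the first-stage network, same shape.
    Returns the coarsely enhanced complex spectrum fed to the second stage.
    """
    noisy_phase = np.angle(noisy_stft)               # phase is left unaltered
    return est_magnitude * np.exp(1j * noisy_phase)  # recombine magnitude + phase

# Toy usage: a 2x2 complex "spectrum" and a stand-in magnitude estimate.
noisy = np.array([[1 + 1j, 2 - 2j], [0.5j, -1.0 + 0j]])
est_mag = np.abs(noisy) * 0.8                        # pretend some noise energy removed
coarse = coarse_enhance(noisy, est_mag)
assert np.allclose(np.abs(coarse), est_mag)          # magnitude replaced by the estimate
assert np.allclose(np.angle(coarse), np.angle(noisy))  # phase preserved from the noisy input
```

The second stage then operates on this complex spectrum directly, which is what allows it to refine the phase that stage 1 left untouched.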


Related articles

SEGAN: Speech Enhancement Generative Adversarial Network

Current speech enhancement techniques operate on the spectral domain and/or exploit some higher-level feature. The majority of them tackle a limited number of noise conditions and rely on first-order statistics. To circumvent these issues, deep networks are being increasingly used, thanks to their ability to learn complex functions from large example sets. In this work, we propose the use of ge...


Automatic Colorization of Grayscale Images Using Generative Adversarial Networks

Automatic colorization of gray scale images poses a unique challenge in Information Retrieval. The goal of this field is to colorize images which have lost some color channels (such as the RGB channels or the AB channels in the LAB color space) while only having the brightness channel available, which is usually the case in a vast array of old photos and portraits. Having the ability to coloriz...


Robust Speech Recognition Using Generative Adversarial Networks

This paper describes a general, scalable, end-to-end framework that uses the generative adversarial network (GAN) objective to enable robust speech recognition. Encoders trained with the proposed approach enjoy improved invariance by learning to map noisy audio to the same embedding space as that of clean audio. Unlike previous methods, the new framework does not rely on domain expertise or sim...


Language and Noise Transfer in Speech Enhancement Generative Adversarial Network

Speech enhancement deep learning systems usually require large amounts of training data to operate in broad conditions or real applications. This makes the adaptability of those systems into new, low resource environments an important topic. In this work, we present the results of adapting a speech enhancement generative adversarial network by finetuning the generator with small amounts of data...


Exploring Speech Enhancement with Generative Adversarial Networks for Robust Speech Recognition

We investigate the effectiveness of generative adversarial networks (GANs) for speech enhancement, in the context of improving noise robustness of automatic speech recognition (ASR) systems. Prior work [1] demonstrates that GANs can effectively suppress additive noise in raw waveform speech signals, improving perceptual quality metrics; however this technique was not justified in the context of...



Journal

Journal title: Speech Communication

Year: 2021

ISSN: 1872-7182, 0167-6393

DOI: https://doi.org/10.1016/j.specom.2021.09.001